Example of Selecting Lag Distance
Let’s say you have a dataset of air quality measurements in a city, and you want to analyze the spatial relationship between stations. Here’s how you might proceed:
Examine the study area: If your data spans an area of 10 km by 10 km, then a lag distance of 500 meters or 1 km might be reasonable.
Check for spatial autocorrelation: Create a variogram to see how the data correlates over distances. You’ll likely see that values are highly correlated at short distances (e.g., 100–500 meters) but become less correlated as the distance grows.
Set lag intervals: Based on the variogram and your knowledge of the study area, you might decide on breaks such as:
0–500 meters
500–1000 meters
1000–2000 meters
2000–5000 meters
Adjust based on the data: If most of your points are clustered closely together, smaller lags can resolve the short-range structure. If the points are spread out, larger lags are needed so that each interval still contains enough pairs to give a stable estimate.
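One way to check whether a set of breaks suits the data is to count how many station pairs land in each interval. The sketch below assumes projected coordinates in meters and uses randomly generated, purely hypothetical station locations; the function names `pair_distances` and `pairs_per_lag` are made up for illustration. A common rule of thumb is to want at least a few dozen pairs per interval, but treat that as a guideline rather than a fixed requirement.

```python
import numpy as np

def pair_distances(coords):
    """Return the distances between all unique pairs of points."""
    coords = np.asarray(coords, dtype=float)
    i, j = np.triu_indices(len(coords), k=1)  # each pair counted once
    return np.linalg.norm(coords[i] - coords[j], axis=1)

def pairs_per_lag(coords, breaks):
    """Count how many point pairs fall into each lag interval."""
    counts, _ = np.histogram(pair_distances(coords), bins=breaks)
    return counts

# Hypothetical stations scattered over a 10 km x 10 km area (meters).
rng = np.random.default_rng(0)
stations = rng.uniform(0, 10_000, size=(60, 2))
breaks = [0, 500, 1000, 2000, 5000]

for lower, upper, n in zip(breaks[:-1], breaks[1:], pairs_per_lag(stations, breaks)):
    print(f"{lower:>5}-{upper:<5} m: {n} pairs")
```

If the shortest intervals turn out nearly empty, that is a sign the breaks are too fine for the station spacing and should be widened.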
The Reason for the Transition from an h-Scatterplot to a Variogram Cloud
h-Scatterplot: For a single lag h, this plot places the value at one location on one axis and the value at a location a distance h away on the other. It gives a rough idea of the spatial relationship at that lag, but a separate plot is needed for each lag, so it doesn’t show how spatial dependence varies with distance in a single view.
Variogram Cloud: The variogram cloud provides a more systematic view of spatial dependence across all distances at once. For every pair of locations, it plots half the squared difference between the two values (the semivariance of that pair) against the distance between those locations.
Variogram Cloud
A variogram cloud shows the semivariance for every pair of spatial locations in the dataset, plotted against the distance separating the pair. It provides valuable information, but it can be noisy, and because n locations produce n(n - 1)/2 pairs, it quickly becomes crowded in large datasets. Each point in the cloud represents one pair of locations at a specific distance, and the values can vary widely due to:
Sampling errors,
Data sparsity in certain ranges,
Local irregularities.
This noise can make it harder to interpret the spatial structure of the data clearly.
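The cloud itself is simple to compute: for each pair of locations, take the separation distance and half the squared difference of the two values. Here is a minimal sketch with NumPy; the function name `variogram_cloud` and the three sample points are made up for illustration.

```python
import numpy as np

def variogram_cloud(coords, values):
    """Return (distance, semivariance) for every unique pair of locations."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)  # indices of each unique pair
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    gamma = 0.5 * (values[i] - values[j]) ** 2  # half the squared difference
    return dist, gamma

# Three hypothetical locations with one measured value each.
coords = [[0, 0], [3, 4], [6, 8]]
values = [1.0, 3.0, 7.0]
dist, gamma = variogram_cloud(coords, values)
print(dist)   # distances for pairs (0,1), (0,2), (1,2)
print(gamma)  # semivariance of each pair
```

Plotting `gamma` against `dist` as a scatterplot gives the cloud. Even in this tiny example, the two pairs at the same distance have different semivariances, which is the scatter described above.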
Binned Variogram
To address this, the variogram cloud is often binned into predefined distance intervals (lags). This process groups the semivariance values for pairs of points that fall within certain distance ranges, effectively averaging them. This reduces the noise and results in a smoother, more stable variogram. The binned variogram provides a clearer representation of the overall trend in spatial autocorrelation, particularly for larger datasets.
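The binning step can be sketched as follows: assign each pair to a distance interval, then average the semivariances in each interval. The function name `binned_variogram` and the sample numbers are hypothetical. Intervals are treated as closed on the left and open on the right, which is one possible convention.

```python
import numpy as np

def binned_variogram(dist, gamma, breaks):
    """Average cloud semivariances within each lag interval [lower, upper)."""
    dist = np.asarray(dist, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    breaks = np.asarray(breaks, dtype=float)
    n_bins = len(breaks) - 1
    idx = np.digitize(dist, breaks) - 1      # interval index of each pair
    inside = (idx >= 0) & (idx < n_bins)     # drop pairs outside the breaks
    counts = np.bincount(idx[inside], minlength=n_bins)
    sums = np.bincount(idx[inside], weights=gamma[inside], minlength=n_bins)
    mean_gamma = np.full(n_bins, np.nan)     # NaN marks an empty interval
    np.divide(sums, counts, out=mean_gamma, where=counts > 0)
    return mean_gamma, counts

# Hypothetical cloud: pair distances (meters) and pair semivariances.
dist = [100, 400, 700, 1500, 1800]
gamma = [1.0, 3.0, 4.0, 10.0, 6.0]
mean_gamma, counts = binned_variogram(dist, gamma, [0, 500, 1000, 2000, 5000])
print(mean_gamma)
print(counts)
```

Returning the pair counts alongside the averages is a deliberate choice: an interval with very few pairs gives an unreliable average, so it helps to see both.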
Improving the Interpretation of Spatial Structure
Variogram Cloud: The cloud may contain too many points to easily discern patterns or trends in the spatial structure. The relationship between distance and semivariance may be obscured by the sheer number of points and their variability.
Binned Variogram: By binning the data, you get a smoothed representation of how semivariance increases with distance. This makes it easier to identify key features of the spatial structure, such as:
The range: The distance at which the semivariance levels off; beyond it, spatial dependence effectively disappears.
The nugget effect: The value the semivariance approaches as the distance shrinks toward zero, indicating variability at very short distances or measurement error.
The sill: The plateau value of the semivariance, beyond which no further increase in variability is observed.
The binned variogram helps in understanding the overall trend in the spatial dependence between locations.
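To see how the three features shape a curve, it can help to evaluate a theoretical variogram model. The sketch below uses the spherical model, which is one common choice and is an assumption here, not something these notes prescribe. The function name and parameter values are hypothetical.

```python
import numpy as np

def spherical_variogram(h, nugget, sill, range_):
    """Spherical model: rises from the nugget and reaches the sill at the range."""
    h = np.asarray(h, dtype=float)
    ratio = np.clip(h / range_, 0.0, 1.0)   # beyond the range the curve stays flat
    gamma = nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)
    return np.where(h == 0, 0.0, gamma)     # semivariance is exactly 0 at h = 0

# Hypothetical parameters: nugget 0.2, sill 1.0, range 1000 m.
lags = np.array([0.0, 1.0, 500.0, 1000.0, 2000.0])
print(spherical_variogram(lags, nugget=0.2, sill=1.0, range_=1000.0))
```

The curve starts just above the nugget for the smallest positive lag, climbs with distance, and stays at the sill for every lag at or beyond the range.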
Interpretations of Variograms
\[C(h) = \text{Sill} - \gamma(h)\]
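The formula above can be applied directly to variogram values, assuming the variogram actually levels off at a sill (so that the sill equals the covariance at distance zero). Dividing the covariance by the sill then gives a correlation that runs from 1 at distance zero down to 0 once the sill is reached. The function names and numbers below are hypothetical.

```python
import numpy as np

def covariance_from_variogram(gamma, sill):
    """C(h) = sill - gamma(h), assuming the variogram reaches a sill."""
    return sill - np.asarray(gamma, dtype=float)

def correlation_from_variogram(gamma, sill):
    """Covariance scaled by the sill: 1 at h = 0, 0 where gamma equals the sill."""
    return covariance_from_variogram(gamma, sill) / sill

# Hypothetical variogram values at increasing lags, with a sill of 2.0.
gamma = [0.0, 0.5, 1.0, 2.0]
print(covariance_from_variogram(gamma, sill=2.0))
print(correlation_from_variogram(gamma, sill=2.0))
```

Reading the two outputs side by side shows that the variogram and the covariance carry the same information, just mirrored: where one rises, the other falls.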
Relationship between a correlogram and a variogram
In class explanations
Further reading: https://csegrecorder.com/articles/view/the-variogram-basics-a-visual-intro-to-useful-geostatistical-concepts